
Here, we look at data from the stock market, particularly some technology stocks. We will learn how to use pandas to get stock information, visualize different aspects of it, and finally we will look at a few ways of analyzing the risk of a stock, based on its previous performance history.
In this section, we handle requesting stock information with pandas and analyze the basic attributes of a stock.
We start by importing the libraries pandas, numpy, matplotlib, seaborn and yfinance (to download data from Yahoo Finance).
After that, we download one year of stock data for the required tech companies (Apple, Google, Microsoft, Amazon), with start date '2020-10-28' and end date '2021-10-28'. We then combine the data sets of all the companies, using zip() to pair each data set with its company name. To distinguish the rows, a column holding the company name is added alongside the data.
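The combining step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the two small frames and their prices are made-up stand-ins for the downloaded data, and the names company_list, company_name and df are assumptions.

```python
import pandas as pd

# Made-up stand-ins for the downloaded data; a real download would
# return one row per trading day with more columns.
dates = pd.date_range("2020-10-28", periods=3, freq="B")
aapl = pd.DataFrame({"Adj Close": [111.0, 115.0, 109.0]}, index=dates)
goog = pd.DataFrame({"Adj Close": [1516.0, 1567.0, 1621.0]}, index=dates)

company_list = [aapl, goog]
company_name = ["APPLE", "GOOGLE"]

# zip() pairs each frame with its label; assign() returns a copy of the
# frame with the extra company_name column.
labelled = [
    frame.assign(company_name=name)
    for frame, name in zip(company_list, company_name)
]

# Stack the labelled frames into one long DataFrame.
df = pd.concat(labelled, axis=0)
print(df.tail())
```

Because every row carries its company name, later steps can filter or group the combined data by that column.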
Summary statistics using describe()
General information using info()
Visualization using historical view of the closing price
Visualization using historical view of the total volume.
Let's move forward and look at the moving averages over 10, 20 and 50 days for the various stocks. First, we add a column for each moving average, named after its number of days, and fill it with the rolling mean of the 'Adj Close' column over that window. The following table shows the last 10 rows of the data.
Plotting all the moving averages using matplotlib (2x2 subplot axes).
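A minimal sketch of the moving-average columns, assuming a frame with an 'Adj Close' column. The prices are made up (a steadily rising series), and the column naming pattern is an assumption.

```python
import pandas as pd

# Made-up prices: 60 days rising from 100 to 159.
prices = pd.DataFrame({"Adj Close": [float(p) for p in range(100, 160)]})

ma_days = [10, 20, 50]
for ma in ma_days:
    column_name = f"MA for {ma} days"
    # rolling(ma) builds a sliding window of `ma` rows; the first
    # ma - 1 rows have no full window and are left as NaN.
    prices[column_name] = prices["Adj Close"].rolling(ma).mean()

print(prices.tail(10))
```

Longer windows react more slowly to price changes, which is why the 50-day line looks smoother than the 10-day line in the plots.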
So far we have only done some baseline analysis, so let's dive a little deeper. We're now going to analyze the risk of the stock. To do so, we need to take a closer look at the daily changes of the stock, and not just its absolute value. Let's use pandas to retrieve the daily returns for the Apple stock.
The steps are: create a new column containing the percentage change of the 'Adj Close' value, then plot again using 2x2 subplot axes. (Here we used line style '--' and marker 'o'.)
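The daily return column can be sketched like this. The prices are made up and the column name 'Daily Return' is an assumption.

```python
import pandas as pd

# Made-up adjusted closing prices for four consecutive days.
aapl = pd.DataFrame({"Adj Close": [100.0, 102.0, 99.96, 104.958]})

# pct_change() computes (today - yesterday) / yesterday for each row;
# the first row has no previous day, so its return is NaN.
aapl["Daily Return"] = aapl["Adj Close"].pct_change()

print(aapl)
```

With matplotlib installed, a call such as aapl["Daily Return"].plot(linestyle="--", marker="o") should draw the series in the style described above.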
Below is an overall look at the average daily return using a histogram. We'll use seaborn to create both a histogram and a KDE plot on the same figure.
The steps are: use a for loop to draw each subplot with seaborn (using displot).
Building a DataFrame with all the ['Close'] columns from each of the stock DataFrames.
Getting the daily returns for all the stocks (as percentage changes, using pct_change()), like we did for the Apple stock.
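The two steps above can be sketched together. The per-stock frames and their prices are made up, and the dictionary name frames is an assumption; df_closing and tech_rets follow the names used later in this text.

```python
import pandas as pd

# Made-up closing prices for two tickers on the same four business days.
dates = pd.date_range("2021-01-04", periods=4, freq="B")
frames = {
    "AAPL": pd.DataFrame({"Close": [100.0, 110.0, 99.0, 99.0]}, index=dates),
    "MSFT": pd.DataFrame({"Close": [200.0, 210.0, 231.0, 207.9]}, index=dates),
}

# One column per ticker, aligned on the shared date index.
df_closing = pd.DataFrame(
    {ticker: frame["Close"] for ticker, frame in frames.items()}
)

# Daily returns for every ticker at once; the first row is NaN.
tech_rets = df_closing.pct_change()

print(tech_rets)
```

Keeping one column per ticker is what lets the later correlation and pair plot steps operate on the whole DataFrame in a single call.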
Comparing the daily percentage returns of two stocks to check how correlated they are. First, we compare a stock to itself.
The steps are: create a jointplot using seaborn with kind 'scatter' (Google vs Google).
Creating a jointplot using seaborn with kind 'scatter' (Google vs Microsoft).
If two stocks are perfectly (and positively) correlated with each other, a linear relationship between their daily return values should occur.
Seaborn and pandas make it very easy to repeat this comparison for every possible combination of stocks in our technology stock ticker list. We can use sns.pairplot() to create this plot automatically.
Simply calling pairplot with kind 'reg' on our DataFrame gives an automatic visual analysis of all the comparisons.
Calling pairplot with kind 'kde'.
The plots above show the relationships between the daily returns of all the stocks. A quick glance shows an interesting correlation between Google and Amazon daily returns. It might be interesting to investigate that individual comparison.
While the simplicity of just calling sns.pairplot() is fantastic, we can also use sns.PairGrid() for full control of the figure, including what kind of plots go on the diagonal, the upper triangle, and the lower triangle. Below is an example of using the full power of seaborn to achieve this result.
The same grid after switching the upper triangle and the lower triangle.
Finally, we can also draw a correlation plot to get actual numerical values for the correlation between the stocks' daily return values. By comparing the closing prices, we see an interesting relationship between Microsoft and Apple.
Using heatmap from seaborn with input 'tech_rets.corr()' and cmap 'YlGnBu'.
Using heatmap from seaborn with input 'df_closing.corr()' and cmap 'summer'.
As suspected from our pair plot, we see here numerically and visually that Microsoft and Amazon had the strongest correlation of daily stock returns. It's also interesting to see that all the technology companies are positively correlated.
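The numbers behind such a heatmap come from DataFrame.corr(). The sketch below uses synthetic returns (random noise around a shared market move), so its values say nothing about the real stocks; the tickers are only labels here.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns: a shared market component plus independent
# noise per ticker. These are NOT real stock returns.
rng = np.random.default_rng(0)
market = rng.normal(0.0, 0.01, 250)
tech_rets = pd.DataFrame({
    "AAPL": market + rng.normal(0.0, 0.005, 250),
    "GOOG": market + rng.normal(0.0, 0.005, 250),
    "MSFT": market + rng.normal(0.0, 0.005, 250),
    "AMZN": market + rng.normal(0.0, 0.005, 250),
})

# Pairwise Pearson correlation: a symmetric matrix with 1.0 on the
# diagonal, because every series is perfectly correlated with itself.
corr = tech_rets.corr()

print(corr.round(2))
```

With seaborn installed, passing this matrix to sns.heatmap(corr, annot=True, cmap="YlGnBu") should render it as a heatmap; annot=True, which prints each value in its cell, is an addition not mentioned in the text.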
There are many ways to quantify risk. One of the most basic, using the information we've gathered on daily percentage returns, is to compare the expected return with the standard deviation of the daily returns.
Let's start by defining a new DataFrame as a cleaned version of the original tech_rets DataFrame (using dropna()). Then we draw a scatter plot with 'expected return' x = rets.mean() and 'risk' y = rets.std(), using a marker area of 20*pi computed with numpy.
Here we use annotate to label the points for a more readable plot.
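A minimal sketch of the quantities behind that scatter plot, using made-up returns. The summary frame and its column names are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Made-up daily returns; the first row is NaN, as pct_change() leaves it.
tech_rets = pd.DataFrame({
    "AAPL": [np.nan, 0.01, -0.02, 0.03],
    "MSFT": [np.nan, 0.02, 0.00, 0.01],
})

# Drop the rows with missing values before computing statistics.
rets = tech_rets.dropna()

# Expected return = mean daily return; risk = sample standard deviation.
summary = pd.DataFrame({
    "expected_return": rets.mean(),
    "risk": rets.std(),
})

# Marker area for the scatter plot, as described in the text.
area = np.pi * 20

print(summary)
```

With matplotlib, plt.scatter(summary["expected_return"], summary["risk"], s=area) followed by one plt.annotate() call per ticker would reproduce the kind of plot described above. A stock toward the right with a low risk value offers more return per unit of volatility.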
Once you have the stock market data, the next step is to create trading strategies and analyse their performance. The ease of analysing performance is a key advantage of Python.
We will analyse the cumulative returns, the drawdown plot, and different ratios such as the Sharpe ratio, Sortino ratio, and Calmar ratio.
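These measures can be sketched from a series of daily returns. The text does not give its formulas, so the conventions below are assumptions: 252 trading days per year, a zero risk-free rate by default, downside deviation taken as the root mean square of negative excess returns, and Calmar as annualized return divided by the absolute maximum drawdown. The function name performance_summary is hypothetical.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed number of trading days per year


def performance_summary(daily_returns: pd.Series,
                        risk_free_rate: float = 0.0) -> dict:
    """Return basic performance measures under the stated assumptions.

    Assumes the series contains at least one loss and one drawdown;
    otherwise the Sortino and Calmar ratios divide by zero.
    """
    excess = daily_returns - risk_free_rate / TRADING_DAYS

    # Growth of 1 unit of capital, and its fall from the running peak.
    cumulative = (1 + daily_returns).cumprod()
    # Treat the starting capital (1.0) as the first peak.
    running_max = cumulative.cummax().clip(lower=1.0)
    drawdown = cumulative / running_max - 1
    max_drawdown = drawdown.min()

    # Sharpe: mean excess return per unit of total volatility.
    sharpe = np.sqrt(TRADING_DAYS) * excess.mean() / excess.std()

    # Sortino: like Sharpe, but only negative returns count as risk.
    downside = np.sqrt((excess.clip(upper=0) ** 2).mean())
    sortino = np.sqrt(TRADING_DAYS) * excess.mean() / downside

    # Calmar: annualized return relative to the worst drawdown.
    years = len(daily_returns) / TRADING_DAYS
    annual_return = cumulative.iloc[-1] ** (1 / years) - 1
    calmar = annual_return / abs(max_drawdown)

    return {
        "cumulative_return": float(cumulative.iloc[-1] - 1),
        "max_drawdown": float(max_drawdown),
        "sharpe": float(sharpe),
        "sortino": float(sortino),
        "calmar": float(calmar),
    }


# Made-up, deliberately extreme returns so the drawdown is easy to follow.
returns = pd.Series([0.10, -0.50, 0.20, 0.05])
stats = performance_summary(returns)
print(stats)
```

In this toy series the value climbs to 1.1, halves to 0.55, and only partly recovers, so the maximum drawdown is 50%. Other sources define the Sortino denominator or the annualization differently, so results may not match another library exactly.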
We see that volume traded and closing price have an inverse relationship. This relationship is commonly observed in finance: if the closing price of a stock decreases, people are more likely to trade that stock. However, we also see that the data is very spiky, because subtle market forces drive the price fluctuations.
Next, we can use an OHLC chart to visualize the data. The OHLC (open, high, low and close) chart is a financial chart describing the open, high, low and close values for a given date.
The horizontal segments represent the open and close values, and the tips of the lines represent the low and high values. Points where the close value is higher than the open are called increasing (in green), and points where the close value is lower than the open are called decreasing (in red).
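The increasing/decreasing rule can be sketched directly on the data. The prices are made up, and the direction and color columns are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Made-up OHLC values for three business days.
ohlc = pd.DataFrame(
    {
        "Open": [100.0, 104.0, 103.0],
        "High": [105.0, 106.0, 108.0],
        "Low": [99.0, 101.0, 102.0],
        "Close": [104.0, 102.0, 107.0],
    },
    index=pd.date_range("2021-10-25", periods=3, freq="B"),
)

# A day is "increasing" when it closes above its open. In this sketch a
# day that closes exactly at its open is also labelled "decreasing".
ohlc["direction"] = np.where(
    ohlc["Close"] > ohlc["Open"], "increasing", "decreasing"
)
ohlc["color"] = ohlc["direction"].map(
    {"increasing": "green", "decreasing": "red"}
)

print(ohlc[["Open", "Close", "direction", "color"]])
```

The text does not name a charting library. Plotly's go.Ohlc trace is one option that takes these four columns and applies the same green/red convention by default.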
